Working with Strings - Part 2

© 2005, Brad Moore

LB Connection

NL139 Home

::::::::::::::::::::::::::::::::::::::::::::::::::::

Working with Strings - 2

Releasing Your Software

A Numbers Game

Native Lines

Precision Numbers

Graphicbox With Scrollbars

Using Wire

::::::::::::::::::::::::::::::::::::::::::::::::::::

Submission Guidelines

Newsletter Help

Index


Introduction

String variables are wonderful and versatile. They can hold letters, digits, and any other characters, as many as you want (up to 2 MB) and in any order you want. You can create a string variable and assign it a value in a single step. Folks from other languages would envy the power.

As I mentioned in part one of this series, LB has some great functions for working with strings. From these we can build our own tool bag of functions to add greater power and versatility to our programming.

Today we are going to look at the individual characters as well as strings that hold numbers (and how to know whether they are a valid number).

At the heart of understanding string data is understanding the ASCII representation of characters. "ASCII" is an acronym for "American Standard Code for Information Interchange". It is a widely used standard for encoding text documents on computers. In ASCII each character of the written language is represented by a numeric value. We can chart these in what is called an ASCII chart.

The first 32 characters in ASCII (values 0 through 31) are non-printable control characters. They date back to the early teletype days, when electronic communications were carried by modem and telex by companies like Western Union and IBM. We still use some of them today in our Liberty Basic programs, such as:

13 = Carriage Return

10 = Line Feed

9 = Tab

Here are the definitions of the first 33 characters (values 0 through 32) of the ASCII table, from [http://www.neurophys.wisc.edu/www/comp/docs/ascii.html]:


Decimal  Octal  Hex   Binary        Value
-------  -----  ---   ------        -----
000      000    000   00000000      NUL    (Null char.)
001      001    001   00000001      SOH    (Start of Header)
002      002    002   00000010      STX    (Start of Text)
003      003    003   00000011      ETX    (End of Text)
004      004    004   00000100      EOT    (End of Transmission)
005      005    005   00000101      ENQ    (Enquiry)
006      006    006   00000110      ACK    (Acknowledgment)
007      007    007   00000111      BEL    (Bell)
008      010    008   00001000       BS    (Backspace)
009      011    009   00001001       HT    (Horizontal Tab)
010      012    00A   00001010       LF    (Line Feed)
011      013    00B   00001011       VT    (Vertical Tab)
012      014    00C   00001100       FF    (Form Feed)
013      015    00D   00001101       CR    (Carriage Return)
014      016    00E   00001110       SO    (Shift Out)
015      017    00F   00001111       SI    (Shift In)
016      020    010   00010000      DLE    (Data Link Escape)
017      021    011   00010001      DC1    (Device Control 1, XON)
018      022    012   00010010      DC2    (Device Control 2)
019      023    013   00010011      DC3    (Device Control 3, XOFF)
020      024    014   00010100      DC4    (Device Control 4)
021      025    015   00010101      NAK    (Negative Acknowledgement)
022      026    016   00010110      SYN    (Synchronous Idle)
023      027    017   00010111      ETB    (End of Trans. Block)
024      030    018   00011000      CAN    (Cancel)
025      031    019   00011001       EM    (End of Medium)
026      032    01A   00011010      SUB    (Substitute)
027      033    01B   00011011      ESC    (Escape)
028      034    01C   00011100       FS    (File Separator)
029      035    01D   00011101       GS    (Group Separator)
030      036    01E   00011110       RS    (Record Separator)
031      037    01F   00011111       US    (Unit Separator)
032      040    020   00100000       SP    (Space)

Liberty Basic has two functions that are used to go back and forth between the ASCII representation of a character (which is a number) and the actual character itself. These are:

ASC(s$) - returns the ASCII value of the first character of string s$

CHR$(n) - returns a one character long string represented by the ASCII value n.
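As a quick illustration of the round trip between the two functions, here is a minimal sketch. The variable names (code, crlf$) are just for this example and do not come from the article:

```basic
'Round trip between a character and its ASCII value
code = asc("A")
print code
print chr$(code)

'ASC only looks at the first character of the string
print asc("Apple")

'Non-printable characters are built with CHR$ in the same way;
'here a carriage return and a line feed are joined into one string
crlf$ = chr$(13) + chr$(10)
print len(crlf$)
```

The first and third print statements show the same value, because only the leading "A" of "Apple" is examined.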

Using these functions we can create (print) our own ASCII table anytime we need one. I find myself doing this from time to time - it is usually faster to whip up a table than look one up:


'ASCII table from Char 33 - 255
for x = 33 to 255
    print x,chr$(x)
next x

Well that is quick and dirty and gets the job done, but it is not very pretty. I think that using a few extra string functions we can get a nicely formatted table that you can print and keep. We will use the RIGHT$ function along with a little trick to force our ASCII number to be formatted to three fixed characters with leading zeros.

We do this by combining a string of zeros ("000") with the string representation of the number (use the STR$ function for that):


num$ = "000" + STR$(x)

That yields "00045" for character number 45. Now take just the last three characters using the RIGHT$ function:


Ascii$ = Right$(num$,3)

We can combine the two lines of code into a single line like this:


Ascii$ = Right$("000" + STR$(x), 3)
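If you use this padding trick often, it could be wrapped in a small function for your tool bag. The sketch below is only one possible way to do it: the name zeroPad$ and its parameters are made up for this example, and it assumes n is a non-negative whole number and width is no more than 10:

```basic
'Try the hypothetical zeroPad$ function with a few values
print zeroPad$(45, 3)
print zeroPad$(7, 5)
print zeroPad$(255, 3)
end

function zeroPad$(n, width)
    'glue a run of zeros in front of the number, then keep
    'only the rightmost width characters
    zeroPad$ = right$("0000000000" + str$(n), width)
end function
```

Note that if the number has more digits than width, RIGHT$ will cut off the leading digits, so pick a width that is large enough for your data.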

To make a real table we should print the items side by side, say eight to a line. We can force them to print side by side by ending our print statement with a semi-colon. This will force the next print statement to place its output immediately following that which we just printed.

By formatting the print line we can get nicer output:


'ASCII table from Char 33 - 255
for x = 33 to 255
    Ascii$ = Right$("000" + STR$(x), 3)
    print Ascii$;" - ";chr$(x);"    ";
next x

But that is not really a table - it is just a long line of text. We need to break after every eight items. We can do this easily by inserting a print statement into the loop that does not have a trailing semi-colon. We can use the new MOD operator (you will need LB 4.02 for this) to decide when to print this line. x MOD 8 is zero whenever x is a multiple of 8, so starting from 33 the line breaks after 40, 48, 56 and so on, giving eight items per row.


'ASCII table from Char 33 - 255: formatted
for x = 33 to 255
    Ascii$ = Right$("000" + STR$(x), 3)
    print Ascii$;" - ";chr$(x);"    ";
    if x MOD 8 = 0 then print ""
next x

This produces a nice ASCII table with eight entries on each line.

Numeric Strings

Let's now consider a string that is supposed to contain a number. When you get input from the user in a freeform format (such as a textbox) you cannot be certain that they have complied with the rules of your form. With Liberty Basic you can force Textboxes to be numeric using a stylebit, so this can be overcome in part.

When a user is entering a number and you are storing it in a string, the string can contain characters as well as valid numerals. How will you know whether the value of the string is a valid number? There are several ways:

You could get the numeric value of the string and compare it against the string. Use the VAL function for this:


x = val(a$)
If x = 0 and a$ <> "0" then
	Print "This is not a number"
End if

Let's just check this out - you can exercise the code using this small program:


a$ = "5532"
gosub [checkMe]

a$ = "55eqe32"
gosub [checkMe]

a$ = "dfa5532"
gosub [checkMe]

a$ = "0"
gosub [checkMe]
end

[checkMe]
x = val(a$)
If x = 0 and a$ <> "0" then
    Print a$;" is not a number"
else
    Print a$;" IS a number"
End if
Return

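If you observe the results you will see that this sort of works, but is not 100% reliable. VAL converts whatever valid number it finds at the start of the string and ignores the rest, so "55eqe32" comes back as a non-zero value and slips through: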


5532 IS a number
55eqe32 IS a number
dfa5532 is not a number
0 IS a number

What we need is a function that will check each character of the string and ensure it is a numeric digit. Well, having just investigated the ASC and CHR$ functions, we are in a good place to consider this. What we will need is the ability to walk through the string one character at a time and verify whether that character is a valid value for a number.

We know from the ASCII table we created earlier that the digits 0 through 9 are ASCII values 48 through 57.
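You can confirm the range with a few lines of code. This is just a small sketch (the variable names d and d$ are chosen for the example); it also shows that subtracting 48 from the ASCII value of a digit character gives back the digit itself:

```basic
'List each digit character next to its ASCII value
for d = 0 to 9
    d$ = chr$(48 + d)
    print d$; " = "; asc(d$)
next d

'Subtracting 48 turns the ASCII value back into the digit's numeric value
print asc("7") - 48
```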

In order to check each character we will need to be able to extract one character from a specific location in the string. We can do this with the MID$ function. From the helpfile:

MID$(string, index, [number])

Description:

This function permits the extraction of a sequence of characters from string starting at index. [number] is optional. If number is not specified, then all the characters from index to the end of the string are returned. If number is specified, then only as many characters as number specifies will be returned, starting from index.

Usage:

print mid$("greeting Earth creature", 10, 5)

Produces:

Earth

We can get the length of the string we are investigating using the LEN function. Also from the helpfile:

LEN( string )

Description:

This function returns the length in characters of string, which can be any valid string expression.

Usage:

prompt "What is your name?"; yourName$

print "Your name is "; len(yourName$); " letters long"

Once we know the length of the string we are trying to validate we can use a FOR-NEXT loop to spin through the string and check each character. Here is some basic code from which we can build our subroutine:


[checkMe]
l = len(a$)
for x = 1 to l
    'this is the character we are checking
    b$ = mid$(a$,x,1)
    'this is the ascii value of the character
    b = asc(b$)
    if b < 48 or b > 57 then
        print a$;" is NOT a number"
        return
    end if
next x
print a$;" IS a number"
return

Let's exercise the subroutine a little. We will use the same code as in our test above:


a$ = "5532"
gosub [checkMe]

a$ = "55eqe32"
gosub [checkMe]

a$ = "dfa5532"
gosub [checkMe]

a$ = "0"
gosub [checkMe]
end

[checkMe]
l = len(a$)
for x = 1 to l
    'this is the character we are checking
    b$ = mid$(a$,x,1)
    'this is the ascii value of the character
    b = asc(b$)
    if b < 48 or b > 57 then
        print a$;" is NOT a number"
        return
    end if
next x
print a$;" IS a number"
return

The results:


5532 IS a number
55eqe32 is NOT a number
dfa5532 is NOT a number
0 IS a number

Now that looks much better. I happen to know we are not all the way there yet though. What if we tested the following values:


 3.452
 -45.33

Both of these valid numbers would fail (go ahead - you try it). The reason: the decimal point (ASCII 46) and the negative sign (ASCII 45) are not in the range of ASCII values we accept (48 through 57). Both of these add complexity to our problem.

A valid number can have only one decimal point, so when we encounter one we must also make sure that it is the only one. Likewise, a negative number must have only one negative sign, and that must be in the first character position. Again we must test for this situation.

We have worked through the general process of developing a function to test all of these items. You should now have the knowledge, and with some practice the skill as well, to implement a function that can validate a more complex number. Here is my solution in function form:


'---------------------------------------
' isNumeric Function (Public Domain - 2005)
' --------------------------------------

function isNumeric(value$)
'returns a 1 if value is all numeric

isNumeric = 0
decimal = 0
a = len(value$)
if a > 0 then
    isNumeric = 1
    for x = 1 to a
        if asc(mid$(value$,x,1)) = 46 then
            decimal = decimal + 1
            if decimal > 1 then
                isNumeric = 0
                exit for
            end if
        else
            if asc(mid$(value$,x,1)) < 48 or asc(mid$(value$,x,1)) > 57 then
                if x=1 and asc(mid$(value$,x,1)) = 45 then
                    'this number is negative
                    isNumeric = 1
                else
                    'this is not a number
                    isNumeric = 0
                    exit for
                end if
            end if
        end if
    next x
end if

end function

It can be easily tested with the following code (be sure to include the function - I have not added it to the end of this code fragment!). Put the function through its paces and see what you get:


' test the function

print isNumeric("561")
print isNumeric("5d6c1")
print isNumeric("")

a$ = "64664"
gosub [checkNum]
a$ = "64hj23j4"
gosub [checkNum]
a$ = ""
gosub [checkNum]
a$ = "64.664"
gosub [checkNum]
a$ = "6.46.64."
gosub [checkNum]
a$ = "-3.14613"
gosub [checkNum]
a$ = "-3.63.27."
gosub [checkNum]
a$ = "-80000"
gosub [checkNum]
a$ = "-5j34j5.234"
gosub [checkNum]
a$ = "-60-60-6"
gosub [checkNum]
a$ = "582782-"
gosub [checkNum]
end

' ------------------------------------

[checkNum]
print a$;
if isNumeric(a$) then
    print " is a number"
else
    print " is not a number"
end if
return
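One case worth trying in your own tests: a string such as "-" or "." contains no digits at all, yet it gets past the checks in isNumeric, because a lone negative sign or a single decimal point is allowed. If that matters for your program, one possible remedy is a companion check that the string holds at least one digit. The function below is only a sketch, and the name hasDigit is made up for this example:

```basic
'Try the hypothetical hasDigit function
print hasDigit("-")
print hasDigit(".")
print hasDigit("-.5")
end

function hasDigit(value$)
    'returns a 1 if value$ contains at least one digit (ASCII 48 - 57)
    hasDigit = 0
    for x = 1 to len(value$)
        c = asc(mid$(value$, x, 1))
        if c >= 48 and c <= 57 then
            hasDigit = 1
            exit for
        end if
    next x
end function
```

This should print 0, 0 and 1. You could then treat a string as a number only when both isNumeric and hasDigit return 1.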

And so we come to the end of another installment in the Working with Strings series. You have a couple more tools for your Liberty Basic tool belt. Go forth and program!

Brad

